AITopics

2507.01533

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Vu-Quoc, Loc, Humer, Alexander

Partial-differential-algebraic equations of nonlinear dynamics by Physics-Informed Neural-Network: (I) Operator splitting and framework assessment

arXiv.org Artificial Intelligence, Aug-7-2024

Several forms for constructing novel physics-informed neural networks (PINNs) for the solution of partial-differential-algebraic equations, based on derivative operator splitting, are proposed, using the nonlinear Kirchhoff rod as a prototype for demonstration. The open-source DeepXDE (DDE) is likely the most well-documented framework, with many examples. Yet we encountered some pathological problems and proposed novel methods to resolve them. Among these novel methods are the PDE forms, which evolve from the lower-level form with fewer unknown dependent variables to the higher-level form with more dependent variables, in addition to those from lower-level forms. Traditionally, the highest-level form, the balance-of-momenta form, is the starting point for (hand) deriving the lowest-level form through a tedious (and error-prone) process of successive substitutions. The next step in a finite element method is to discretize the lowest-level form upon forming a weak form and linearization with appropriate interpolation functions, followed by their implementation in a code and testing. The time-consuming tedium in all of these steps could be bypassed by applying the proposed novel PINN directly to the highest-level form. We developed a script based on JAX. While our JAX script did not show the pathological problems of DDE-T (DDE with TensorFlow backend), it is slower than DDE-T. That DDE-T itself is more efficient in the higher-level form than in the lower-level form makes working directly with the higher-level form even more attractive, in addition to the advantages mentioned above. Since coming up with an appropriate learning-rate schedule for a good solution is more art than science, we systematically codified in detail our experience running optimization through a normalization/standardization of the network-training process so that readers can reproduce our results.

displacement, remark 5, static solution, (16 more...)

2408.01914

Country:

Europe > Norway > Eastern Norway > Oslo (0.04)
North America > United States > Wisconsin > Outagamie County > Appleton (0.04)
North America > United States > New York (0.04)
(5 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Gold, Dara, Rosenberg, Steven

Discretized Gradient Flow for Manifold Learning in the Space of Embeddings

arXiv.org Artificial Intelligence, May-2-2024

Gradient descent, or negative gradient flow, is a standard technique in optimization to find minima of functions. Many implementations of gradient descent rely on discretized versions, i.e., moving in the gradient direction for a set step size, recomputing the gradient, and continuing. In this paper, we present an approach to manifold learning where gradient descent takes place in the infinite dimensional space $\mathcal{E} = {\rm Emb}(M,\mathbb{R}^N)$ of smooth embeddings $\phi$ of a manifold $M$ into $\mathbb{R}^N$. Implementing a discretized version of gradient descent for $P:\mathcal{E}\to {\mathbb R}$, a penalty function that scores an embedding $\phi \in \mathcal{E}$, requires estimating how far we can move in a fixed direction -- the direction of one gradient step -- before leaving the space of smooth embeddings. Our main result is to give an explicit lower bound for this step length in terms of the Riemannian geometry of $\phi(M)$. In particular, we consider the case when the gradient of $P$ is pointwise normal to the embedded manifold $\phi(M)$. We prove this case arises when $P$ is invariant under diffeomorphisms of $M$, a natural condition in manifold learning.
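The discretized scheme described in the abstract (step in the negative gradient direction for a set step size, recompute the gradient, continue) can be sketched in finite dimensions as follows. This is only an illustration in the plane, not the paper's infinite-dimensional setting of smooth embeddings, and it does not implement the step-length bound. The function names, the quadratic test function, and the step size are all assumptions chosen for the example.

```python
def gradient_descent(grad, x0, step_size=0.1, num_steps=100):
    """Discretized gradient descent: step against the gradient, recompute, repeat."""
    x = list(x0)
    for _ in range(num_steps):
        g = grad(x)  # recompute the gradient at the current point
        x = [xi - step_size * gi for xi, gi in zip(x, g)]
    return x


def grad_f(p):
    """Gradient of the hypothetical test function f(x, y) = (x - 1)^2 + 2 (y + 3)^2."""
    x, y = p
    return [2.0 * (x - 1.0), 4.0 * (y + 3.0)]


# Starting from the origin, the iterates approach the minimizer (1, -3).
minimum = gradient_descent(grad_f, [0.0, 0.0])
```

In this toy problem any sufficiently small fixed step size converges. In the setting of the paper, the step must additionally be short enough that the updated map is still a smooth embedding.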

gradient flow, manifold, theorem 5, (13 more...)

1901.09057

Country:

North America > United States > New York (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(3 more...)

Genre: Research Report (0.40)

Industry: Education (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Álvarez-López, Antonio, Orive-Illera, Rafael, Zuazua, Enrique

Optimized classification with neural ODEs via separability

arXiv.org Artificial Intelligence, Dec-21-2023

Classification of $N$ points becomes a simultaneous control problem when viewed through the lens of neural ordinary differential equations (neural ODEs), which represent the time-continuous limit of residual networks. For the narrow model, with one neuron per hidden layer, it has been shown that the task can be achieved using $O(N)$ neurons. In this study, we focus on estimating the number of neurons required for efficient cluster-based classification, particularly in the worst-case scenario where points are independently and uniformly distributed in $[0,1]^d$. Our analysis provides a novel method for quantifying the probability of requiring fewer than $O(N)$ neurons, emphasizing the asymptotic behavior as both $d$ and $N$ increase. Additionally, under the sole assumption that the data are in general position, we propose a new constructive algorithm that simultaneously classifies clusters of $d$ points from any initial configuration, effectively reducing the maximal complexity to $O(N/d)$ neurons.
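To make the link between neural ODEs and residual networks concrete, here is a minimal sketch assuming the commonly used narrow form x'(t) = w(t) σ(a(t)·x(t) + b(t)) with a ReLU activation and piecewise-constant controls. The abstract does not state this form, so it is an assumption. One forward Euler step of size h is then a residual layer with a single hidden neuron. The parameter values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def relu(z):
    return max(0.0, z)


def narrow_neural_ode_euler(x, params, h):
    """Forward Euler for x' = w * relu(a . x + b), one (w, a, b) triple per step.

    Each step is a residual update x <- x + h * w * relu(a . x + b),
    i.e. a residual layer with one hidden neuron.
    """
    x = list(x)
    for w, a, b in params:
        s = relu(sum(ai * xi for ai, xi in zip(a, x)) + b)  # scalar neuron output
        x = [xi + h * wi * s for xi, wi in zip(x, w)]
    return x


# Two hypothetical layers acting on a point in the plane.
params = [
    ([1.0, 0.0], [0.0, 1.0], 0.0),   # move along the first axis, driven by the second coordinate
    ([0.0, -1.0], [1.0, 0.0], -1.0),  # move along the second axis, active where x_1 > 1
]
out = narrow_neural_ode_euler([0.5, 2.0], params, h=0.5)
```

Because the ReLU vanishes on one side of the hyperplane a·x + b = 0, each layer moves only the points on the other side. Constructions of this kind rely on that property to steer clusters of points.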

general position, hyperplane, neural ode, (13 more...)

2312.13807

Country:

Europe > Spain > Galicia > Madrid (0.04)
Europe > Spain > Basque Country > Biscay Province > Bilbao (0.04)
Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Sridhar, Anirudh, Kar, Soummya

Mean-field Approximations for Stochastic Population Processes with Heterogeneous Interactions

arXiv.org Artificial Intelligence, Jul-19-2023

This paper studies a general class of stochastic population processes in which agents interact with one another over a network. Agents update their behaviors in a random and decentralized manner according to a policy that depends only on the agent's current state and an estimate of the macroscopic population state, given by a weighted average of the neighboring states. When the number of agents is large and the network is a complete graph (has all-to-all information access), the macroscopic behavior of the population can be well-approximated by a set of deterministic differential equations called a mean-field approximation. For incomplete networks, it has remained unclear whether a suitable mean-field approximation exists in general for the macroscopic behavior of the population. The paper addresses this gap by establishing a generic theory describing when various mean-field approximations are accurate for arbitrary interaction structures. Our results are threefold. Letting $W$ be the matrix describing agent interactions, we first show that a simple mean-field approximation that incorrectly assumes a homogeneous interaction structure is accurate provided $W$ has a large spectral gap. Second, we show that a more complex mean-field approximation which takes into account agent interactions is accurate as long as the Frobenius norm of $W$ is small. Finally, we compare the predictions of the two mean-field approximations through simulations, highlighting cases where using mean-field approximations that assume a homogeneous interaction structure can lead to inaccurate qualitative and quantitative predictions.

approximation, artificial intelligence, theorem 5, (15 more...)

2101.09644

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.87)

Bastounis, Alexander, Hansen, Anders C, Vlačić, Verner

The mathematics of adversarial attacks in AI -- Why deep learning is unstable despite the existence of stable neural networks

arXiv.org Machine Learning, Sep-13-2021

The unprecedented success of deep learning (DL) makes it unchallenged when it comes to classification problems. However, it is well established that the current DL methodology produces universally unstable neural networks (NNs). The instability problem has caused an enormous research effort -- with a vast literature on so-called adversarial attacks -- yet there has been no solution to the problem. Our paper addresses why there has been no solution to the problem, as we prove the following mathematical paradox: any training procedure based on training neural networks for classification problems with a fixed architecture will yield neural networks that are either inaccurate or unstable (if accurate) -- despite the provable existence of both accurate and stable neural networks for the same classification problems. The key is that the stable and accurate neural networks must have variable dimensions depending on the input; in particular, variable dimensions are a necessary condition for stability. Our result points towards the paradox that accurate and stable neural networks exist, yet modern algorithms do not compute them. This yields the question: if the existence of neural networks with desirable properties can be proven, can one also find algorithms that compute them? There are cases in mathematics where provable existence implies computability, but will this be the case for neural networks? The contrary is true, as we demonstrate how neural networks can provably exist as approximate minimisers to standard optimisation problems with standard cost functions, yet no randomised algorithm can compute them with probability better than 1/2.

algorithm, neural network, theorem 2, (16 more...)

2109.06098

Country:

North America > United States (0.46)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Information Technology > Security & Privacy (0.71)
Government > Military (0.71)
Government > Regional Government (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)

Bou-Rabee, Nawaf, Eberle, Andreas

Couplings for Andersen Dynamics

arXiv.org Machine Learning, Sep-29-2020

Andersen dynamics is a standard method for molecular simulations, and a precursor of the Hamiltonian Monte Carlo algorithm used in MCMC inference. The stochastic process corresponding to Andersen dynamics is a PDMP (piecewise deterministic Markov process) that iterates between Hamiltonian flows and velocity randomizations of randomly selected particles. Both from the viewpoint of molecular dynamics and MCMC inference, a basic question is to understand the convergence to equilibrium of this PDMP, particularly in high dimension. Here we present couplings to obtain sharp convergence bounds in the Wasserstein sense that do not require global convexity of the underlying potential energy. (October 1, 2020.) 1. Introduction: A common task in molecular dynamics is to simulate a molecular system at a specified temperature [3, 27].
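The alternation described above, Hamiltonian flow followed by velocity randomization of a randomly selected particle, can be sketched as below. This is a toy illustration under several assumptions not taken from the paper: each particle is one-dimensional with unit mass, the potential is the harmonic U(x) = sum(x_i^2)/2, the temperature is 1, waiting times between randomizations are exponential, and the exact Hamiltonian flow is approximated by leapfrog steps. All function names are hypothetical.

```python
import random


def grad_potential(x):
    """Gradient of the hypothetical potential U(x) = sum(x_i^2) / 2."""
    return list(x)


def leapfrog(x, v, step, n_steps):
    """Approximate the Hamiltonian flow with leapfrog (velocity Verlet) steps."""
    x, v = list(x), list(v)
    for _ in range(n_steps):
        g = grad_potential(x)
        v = [vi - 0.5 * step * gi for vi, gi in zip(v, g)]  # half kick
        x = [xi + step * vi for xi, vi in zip(x, v)]        # drift
        g = grad_potential(x)
        v = [vi - 0.5 * step * gi for vi, gi in zip(v, g)]  # half kick
    return x, v


def andersen_sketch(x, v, n_events, rate=1.0, step=0.01, rng=None):
    """Alternate approximate Hamiltonian flow with single-particle velocity refreshes."""
    rng = rng or random.Random(0)
    for _ in range(n_events):
        duration = rng.expovariate(rate)  # random time until the next refresh
        x, v = leapfrog(x, v, step, max(1, round(duration / step)))
        i = rng.randrange(len(v))         # pick one particle uniformly at random
        v[i] = rng.gauss(0.0, 1.0)        # resample its velocity (unit temperature)
    return x, v


x_final, v_final = andersen_sketch([1.0, -0.5, 0.25], [0.0, 0.0, 0.0], n_events=50)
```

Between refreshes the leapfrog integrator nearly conserves the energy, so only the velocity resampling exchanges energy with the heat bath. The choice of integrator and step size is a convenience for the sketch and is not part of the process the paper analyzes.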

andersen dynamic, artificial intelligence, machine learning, (15 more...)

2009.14239

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New Jersey > Camden County > Camden (0.04)
Europe > United Kingdom > Wales (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Hien, Le Thi Khanh, Gillis, Nicolas, Patrinos, Panagiotis

Inertial Block Mirror Descent Method for Non-Convex Non-Smooth Optimization

arXiv.org Machine Learning, Mar-5-2019

In this paper, we propose inertial versions of block coordinate descent methods for solving non-convex non-smooth composite optimization problems. We use the general framework of Bregman distance functions to compute the proximal maps. Our method not only allows using two different extrapolation points to evaluate gradients and adding the inertial force, but also takes advantage of randomly picking the block of variables to update. Moreover, our method does not require a restarting step, and as such, it is not a monotonically decreasing method. To prove the convergence of the whole generated sequence to a critical point, we modify the convergence proof recipe of Bolte, Sabach and Teboulle (Proximal alternating linearized minimization for non-convex and non-smooth problems, Math. Prog. 146(1):459--494, 2014), and combine it with auxiliary functions. We deploy the proposed methods to solve non-negative matrix factorization (NMF) problems and show that they compete favourably with the state-of-the-art NMF algorithms.

algorithm, artificial intelligence, machine learning, (18 more...)

1903.01818

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > Russia (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Asia > Russia (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Shang, Zuofeng, Cheng, Guang

Local and global asymptotic inference in smoothing spline models

arXiv.org Machine Learning, Nov-26-2013

This article studies local and global inference for smoothing spline estimation in a unified asymptotic framework. We first introduce a new technical tool called functional Bahadur representation, which significantly generalizes the traditional Bahadur representation in parametric models, that is, Bahadur [Ann. Inst. Statist. Math. 37 (1966) 577-580]. Equipped with this tool, we develop four interconnected procedures for inference: (i) pointwise confidence interval; (ii) local likelihood ratio testing; (iii) simultaneous confidence band; (iv) global likelihood ratio testing. In particular, our confidence intervals are proved to be asymptotically valid at any point in the support, and they are shorter on average than the Bayesian confidence intervals proposed by Wahba [J. R. Stat. Soc. Ser. B Stat. Methodol. 45 (1983) 133-150] and Nychka [J. Amer. Statist. Assoc. 83 (1988) 1134-1143]. We also discuss a version of the Wilks phenomenon arising from local/global likelihood ratio testing. It is also worth noting that our simultaneous confidence bands are the first ones applicable to general quasi-likelihood models. Furthermore, issues relating to optimality and efficiency are carefully addressed. As a by-product, we discover a surprising relationship between periodic and nonperiodic smoothing splines in terms of inference.

artificial intelligence, assumption, machine learning, (15 more...)

doi: 10.1214/13-AOS1164

1212.6788

Country: North America > United States > Indiana (0.46)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)